AITopics | bipartite network

2512.01716

Country:

Europe > United Kingdom (0.04)
Africa > Kenya (0.04)
North America > Canada (0.04)
Europe > France (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)

Karki, Jony, Huang, Dongzhou, Zhao, Yunpeng

Variational Estimators for Node Popularity Models

arXiv.org Machine LearningNov-25-2025

Node popularity is recognized as a key factor in modeling real-world networks, capturing heterogeneity in connectivity across communities. This concept is equally important in bipartite networks, where nodes in different partitions may exhibit varying popularity patterns, motivating models such as the Two-Way Node Popularity Model (TNPM). Existing methods, such as the Two-Stage Divided Cosine (TSDC) algorithm, provide a scalable estimation approach but may have limitations in terms of accuracy or applicability across different types of networks. In this paper, we develop a computationally efficient and theoretically justified variational expectation-maximization (VEM) framework for the TNPM. We establish label consistency for the estimated community assignments produced by the proposed variational estimator in bipartite networks. Through extensive simulation studies, we show that our method achieves superior estimation accuracy across a range of bipartite as well as undirected networks compared to existing algorithms. Finally, we evaluate our method on real-world bipartite and undirected networks, further demonstrating its practical effectiveness and robustness.

algorithm, block model, matrix, (16 more...)

2511.17783

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Minnesota (0.04)
North America > United States > Colorado (0.04)

Genre: Research Report (0.82)

Industry: Government (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)

Amadori, Eleonora, Cirulli, Daniele, Di Martino, Edoardo, Nudo, Jacopo, Sahakyan, Maria, Sangiorgio, Emanuele, Santoro, Arnaldo, Zollo, Simon, Galeazzi, Alessandro, Di Marco, Niccolò

Involvement drives complexity of language in online debates

arXiv.org Artificial IntelligenceJun-30-2025

Language is a fundamental aspect of human societies, continuously evolving in response to various stimuli, including societal changes and intercultural interactions. Technological advancements have profoundly transformed communication, with social media emerging as a pivotal force that merges entertainment-driven content with complex social dynamics. As these platforms reshape public discourse, analyzing the linguistic features of user-generated content is essential to understanding their broader societal impact. In this paper, we examine the linguistic complexity of content produced by influential users on Twitter across three globally significant and contested topics: COVID-19, COP26, and the Russia-Ukraine war. By combining multiple measures of textual complexity, we assess how language use varies along four key dimensions: account type, political leaning, content reliability, and sentiment. Our analysis reveals significant differences across all four axes, including variations in language complexity between individuals and organizations, between profiles with sided versus moderate political views, and between those associated with higher versus lower reliability scores. Additionally, profiles producing more negative and offensive content tend to use more complex language, with users sharing similar political stances and reliability levels converging toward a common jargon. Our findings offer new insights into the sociolinguistic dynamics of digital platforms and contribute to a deeper understanding of how language reflects ideological and social structures in online spaces.

large language model, machine learning, natural language, (20 more...)

2506.22098

Country:

Europe > Ukraine (0.37)
Asia > Russia (0.34)
Europe > Russia (0.25)
(7 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Government (1.00)
Media > News (0.93)
Health & Medicine > Therapeutic Area > Immunology (0.37)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

Cai, Erica, O'Connor, Brendan

Understanding the Effect of Knowledge Graph Extraction Error on Downstream Graph Analyses: A Case Study on Affiliation Graphs

arXiv.org Artificial IntelligenceJun-17-2025

Knowledge graphs (KGs) are useful for analyzing social structures, community dynamics, institutional memberships, and other complex relationships across domains from sociology to public health. While recent advances in large language models (LLMs) have improved the scalability and accessibility of automated KG extraction from large text corpora, the impacts of extraction errors on downstream analyses are poorly understood, especially for applied scientists who depend on accurate KGs for real-world insights. To address this gap, we conducted the first evaluation of KG extraction performance at two levels: (1) micro-level edge accuracy, which is consistent with standard NLP evaluations, and manual identification of common error sources; (2) macro-level graph metrics that assess structural properties such as community detection and connectivity, which are relevant to real-world applications. Focusing on affiliation graphs of person membership in organizations extracted from social register books, our study identifies a range of extraction performance where biases across most downstream graph analysis metrics are near zero. However, as extraction performance declines, we find that many metrics exhibit increasingly pronounced biases, with each metric tending toward a consistent direction of either over- or under-estimation. Through simulations, we further show that error models commonly used in the literature do not capture these bias patterns, indicating the need for more realistic error models for KG extraction. Our findings provide actionable insights for practitioners and underscores the importance of advancing extraction methods and error modeling to ensure reliable and meaningful downstream analyses.

data mining, large language model, machine learning, (22 more...)

2506.12367

Country:

Africa (1.00)
North America > United States > Colorado (0.28)

Genre: Research Report > New Finding (0.66)

Industry:

Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.93)
Health & Medicine > Therapeutic Area (0.93)
(2 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Yang, Hongyuan, Peng, Siqi, Yamamoto, Akihiro

BicliqueEncoder: An Efficient Method for Link Prediction in Bipartite Networks using Formal Concept Analysis and Transformer Encoder

arXiv.org Artificial IntelligenceMar-20-2025

We propose a novel and efficient method for link prediction in bipartite networks, using \textit{formal concept analysis} (FCA) and the Transformer encoder. Link prediction in bipartite networks finds practical applications in various domains such as product recommendation in online sales, and prediction of chemical-disease interaction in medical science. Since for link prediction, the topological structure of a network contains valuable information, many approaches focus on extracting structural features and then utilizing them for link prediction. Bi-cliques, as a type of structural feature of bipartite graphs, can be utilized for link prediction. Although several link prediction methods utilizing bi-cliques have been proposed and perform well in rather small datasets, all of them face challenges with scalability when dealing with large datasets since they demand substantial computational resources. This limits the practical utility of these approaches in real-world applications. To overcome the limitation, we introduce a novel approach employing iceberg concept lattices and the Transformer encoder. Our method requires fewer computational resources, making it suitable for large-scale datasets while maintaining high prediction performance. We conduct experiments on five large real-world datasets that exceed the capacity of previous bi-clique-based approaches to demonstrate the efficacy of our method. Additionally, we perform supplementary experiments on five small datasets to compare with the previous bi-clique-based methods for bipartite link prediction and demonstrate that our method is more efficient than the previous ones.

artificial intelligence, data mining, machine learning, (20 more...)

2503.07645

Country:

Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology (0.67)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

arXiv.org Machine LearningDec-10-2024

Two-way Node Popularity Model for Directed and Bipartite Networks

Jing, Bing-Yi, Li, Ting, Wang, Jiangzhou, Wang, Ya

There has been extensive research on community detection in directed and bipartite networks. However, these studies often fail to consider the popularity of nodes in different communities, which is a common phenomenon in real-world networks. To address this issue, we propose a new probabilistic framework called the Two-Way Node Popularity Model (TNPM). The TNPM also accommodates edges from different distributions within a general sub-Gaussian family. We introduce the Delete-One-Method (DOM) for model fitting and community structure identification, and provide a comprehensive theoretical analysis with novel technical skills dealing with sub-Gaussian generalization. Additionally, we propose the Two-Stage Divided Cosine Algorithm (TSDC) to handle large-scale networks more efficiently. Our proposed methods offer multi-folded advantages in terms of estimation accuracy and computational efficiency, as demonstrated through extensive numerical studies. We apply our methods to two real-world applications, uncovering interesting findings.

adjacency matrix, algorithm, matrix, (14 more...)

2412.08051

Country:

North America > United States (0.14)
Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Hong Kong (0.04)
(7 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.93)
Information Technology > Data Science (0.68)

Özer, Şükrü Demir İnan, Orman, Günce Keziban, Labatut, Vincent

Link Prediction in Bipartite Networks

arXiv.org Artificial IntelligenceJun-10-2024

Bipartite networks serve as highly suitable models to represent systems involving interactions between two distinct types of entities, such as online dating platforms, job search services, or ecommerce websites. These models can be leveraged to tackle a number of tasks, including link prediction among the most useful ones, especially to design recommendation systems. However, if this task has garnered much interest when conducted on unipartite (i.e. standard) networks, it is far from being the case for bipartite ones. In this study, we address this gap by performing an experimental comparison of 19 link prediction methods able to handle bipartite graphs. Some come directly from the literature, and some are adapted by us from techniques originally designed for unipartite networks. We also propose to repurpose recommendation systems based on graph convolutional networks (GCN) as a novel link prediction solution for bipartite networks. To conduct our experiments, we constitute a benchmark of 3 real-world bipartite network datasets with various topologies. Our results indicate that GCN-based personalized recommendation systems, which have received significant attention in recent years, can produce successful results for link prediction in bipartite networks. Furthermore, purely heuristic metrics that do not rely on any learning process, like the Structural Perturbation Method (SPM), can also achieve success.

bipartite network, node, prediction, (14 more...)

2406.06658

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > Virginia > Arlington County > Arlington (0.04)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Services > e-Commerce Services (0.34)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

Anakok, Emre, Barbillon, Pierre, Fontaine, Colin, Thebault, Elisa

Bipartite Graph Variational Auto-Encoder with Fair Latent Representation to Account for Sampling Bias in Ecological Networks

arXiv.org Machine LearningMar-4-2024

We propose a method to represent bipartite networks using graph embeddings tailored to tackle the challenges of studying ecological networks, such as the ones linking plants and pollinators, where many covariates need to be accounted for, in particular to control for sampling bias. We adapt the variational graph auto-encoder approach to the bipartite case, which enables us to generate embeddings in a latent space where the two sets of nodes are positioned based on their probability of connection. We translate the fairness framework commonly considered in sociology in order to address sampling bias in ecology. By incorporating the Hilbert-Schmidt independence criterion (HSIC) as an additional penalty term in the loss we optimize, we ensure that the structure of the latent space is independent of continuous variables, which are related to the sampling process. Finally, we show how our approach can change our understanding of ecological networks when applied to the Spipoll data set, a citizen science monitoring program of plant-pollinator interactions to which many observers contribute, making it prone to sampling bias.

hsic, interaction, latent space, (17 more...)

2403.02011

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > Belgium (0.04)
Asia > Macao (0.04)
Asia > China (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Peng, Siqi, Yang, Hongyuan, Yamamoto, Akihiro

BERT4FCA: A Method for Bipartite Link Prediction using Formal Concept Analysis and BERT

arXiv.org Artificial IntelligenceFeb-13-2024

We propose BERT4FCA, a novel method for link prediction in bipartite networks, using formal concept analysis (FCA) and BERT. Link prediction in bipartite networks is an important task that can solve various practical problems like friend recommendation in social networks and co-authorship prediction in author-paper networks. Recent research has found that in bipartite networks, maximal bi-cliques provide important information for link prediction, and they can be extracted by FCA. Some FCA-based bipartite link prediction methods have achieved good performance. However, we figured out that their performance could be further improved because these methods did not fully capture the rich information of the extracted maximal bi-cliques. To address this limitation, we propose an approach using BERT, which can learn more information from the maximal bi-cliques extracted by FCA and use them to make link prediction. We conduct experiments on three real-world bipartite networks and demonstrate that our method outperforms previous FCA-based methods, and some classic methods such as matrix-factorization and node2vec.

bipartite network, information, prediction, (14 more...)

2402.08236

Country:

Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)
Europe > Greece > Attica > Athens (0.04)
Europe > Russia > Volga Federal District > Nizhny Novgorod Oblast > Nizhny Novgorod (0.04)
Asia > Russia (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Yang, Renchi, Shi, Jieming

Efficient High-Quality Clustering for Large Bipartite Graphs

arXiv.org Artificial IntelligenceDec-28-2023

A bipartite graph contains inter-set edges between two disjoint vertex sets, and is widely used to model real-world data, such as user-item purchase records, author-article publications, and biological interactions between drugs and proteins. k-Bipartite Graph Clustering (k-BGC) is to partition the target vertex set in a bipartite graph into k disjoint clusters. The clustering quality is important to the utility of k-BGC in various applications like social network analysis, recommendation systems, text mining, and bioinformatics, to name a few. Existing approaches to k-BGC either output clustering results with compromised quality due to inadequate exploitation of high-order information between vertices, or fail to handle sizable bipartite graphs with billions of edges. Motivated by this, this paper presents two efficient k-BGC solutions, HOPE and HOPE+, which achieve state-of-the-art performance on large-scale bipartite graphs. HOPE obtains high scalability and effectiveness through a new k-BGC problem formulation based on the novel notion of high-order perspective (HOP) vectors and an efficient technique for low-rank approximation of HOP vectors. HOPE+ further elevates the k-BGC performance to another level with a judicious problem transformation and a highly efficient two-stage optimization framework. Two variants, HOPE+ (FNEM) and HOPE+ (SNEM) are designed when either the Frobenius norm or spectral norm is applied in the transformation. Extensive experiments, comparing HOPE and HOPE+ against 13 competitors on 10 real-world datasets, exhibit that our solutions, especially HOPE+, are superior to existing methods in terms of result quality, while being up to orders of magnitude faster. On the largest dataset MAG with 1.1 billion edges, HOPE+ is able to produce clusters with the highest clustering accuracy within 31 minutes, which is unmatched by any existing solution for k-BGC.

bipartite graph, graph, vertex, (15 more...)

2312.16926

Country:

Asia > China > Hong Kong (0.04)
North America > United States > Washington (0.04)

Genre: Research Report (0.63)

Industry:

Health & Medicine (0.46)
Information Technology > Services (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)